Howard County
A new initialisation to Control Gradients in Sinusoidal Neural network
Combette, Andrea, Venaille, Antoine, Pustelnik, Nelly
Proper initialisation strategy is of primary importance to mitigate gradient explosion or vanishing when training neural networks. Yet, the impact of initialisation parameters still lacks a precise theoretical understanding for several well-established architectures. Here, we propose a new initialisation for networks with sinusoidal activation functions such as \texttt{SIREN}, focusing on gradients control, their scaling with network depth, their impact on training and on generalization. To achieve this, we identify a closed-form expression for the initialisation of the parameters, differing from the original \texttt{SIREN} scheme. This expression is derived from fixed points obtained through the convergence of pre-activation distribution and the variance of Jacobian sequences. Controlling both gradients and targeting vanishing pre-activation helps preventing the emergence of inappropriate frequencies during estimation, thereby improving generalization. We further show that this initialisation strongly influences training dynamics through the Neural Tangent Kernel framework (NTK). Finally, we benchmark \texttt{SIREN} with the proposed initialisation against the original scheme and other baselines on function fitting and image reconstruction. The new initialisation consistently outperforms state-of-the-art methods across a wide range of reconstruction tasks, including those involving physics-informed neural networks.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Texas > Howard County (0.04)
- Europe > Switzerland (0.04)
- (2 more...)
- North America > United States > Texas > Howard County (0.04)
- Asia > China (0.04)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.93)
SurveyGen: Quality-Aware Scientific Survey Generation with Large Language Models
Bao, Tong, Nayeem, Mir Tafseer, Rafiei, Davood, Zhang, Chengzhi
Automatic survey generation has emerged as a key task in scientific document processing. While large language models (LLMs) have shown promise in generating survey texts, the lack of standardized evaluation datasets critically hampers rigorous assessment of their performance against human-written surveys. In this work, we present SurveyGen, a large-scale dataset comprising over 4,200 human-written surveys across diverse scientific domains, along with 242,143 cited references and extensive quality-related metadata for both the surveys and the cited papers. Leveraging this resource, we build QUAL-SG, a novel quality-aware framework for survey generation that enhances the standard Retrieval-Augmented Generation (RAG) pipeline by incorporating quality-aware indicators into literature retrieval to assess and select higher-quality source papers. Using this dataset and framework, we systematically evaluate state-of-the-art LLMs under varying levels of human involvement - from fully automatic generation to human-guided writing. Experimental results and human evaluations show that while semi-automatic pipelines can achieve partially competitive outcomes, fully automatic survey generation still suffers from low citation quality and limited critical analysis.
- North America > Canada > Alberta (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Middle East > Jordan (0.04)
- (9 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Materials > Chemicals (0.67)
Simulation-informed deep learning for enhanced SWOT observations of fine-scale ocean dynamics
Cutolo, Eugenio, Granero-Belinchon, Carlos, Thiraux, Ptashanna, Wang, Jinbo, Fablet, Ronan
Oceanic processes at fine scales are crucial yet difficult to observe accurately due to limitations in satellite and in-situ measurements. The Surface Water and Ocean Topography (SWOT) mission provides high-resolution Sea Surface Height (SSH) data, though noise patterns often obscure fine scale structures. Current methods struggle with noisy data or require extensive supervised training, limiting their effectiveness on real-world observations. We introduce SIMPGEN (Simulation-Informed Metric and Prior for Generative Ensemble Networks), an unsupervised adversarial learning framework combining real SWOT observations with simulated reference data. SIMPGEN leverages wavelet-informed neural metrics to distinguish noisy from clean fields, guiding realistic SSH reconstructions. Applied to SWOT data, SIMPGEN effectively removes noise, preserving fine-scale features better than existing neural methods. This robust, unsupervised approach not only improves SWOT SSH data interpretation but also demonstrates strong potential for broader oceanographic applications, including data assimilation and super-resolution.
- Southern Ocean (0.05)
- Atlantic Ocean > Mediterranean Sea (0.05)
- North America > United States > Texas > Howard County (0.04)
- (5 more...)
Compositional Generalization Requires More Than Disentangled Representations
Liang, Qiyao, Qian, Daoyuan, Ziyin, Liu, Fiete, Ila
Composition-the ability to generate myriad variations from finite means-is believed to underlie powerful generalization. However, compositional generalization remains a key challenge for deep learning. A widely held assumption is that learning disentangled (factorized) representations naturally supports this kind of extrapolation. Yet, empirical results are mixed, with many generative models failing to recognize and compose factors to generate out-of-distribution (OOD) samples. In this work, we investigate a controlled 2D Gaussian "bump" generation task, demonstrating that standard generative architectures fail in OOD regions when training with partial data, even when supplied with fully disentangled $(x, y)$ coordinates, re-entangling them through subsequent layers. By examining the model's learned kernels and manifold geometry, we show that this failure reflects a "memorization" strategy for generation through the superposition of training data rather than by combining the true factorized features. We show that models forced-through architectural modifications with regularization or curated training data-to create disentangled representations in the full-dimensional representational (pixel) space can be highly data-efficient and effective at learning to compose in OOD regions. These findings underscore that bottlenecks with factorized/disentangled representations in an abstract representation are insufficient: the model must actively maintain or induce factorization directly in the representational space in order to achieve robust compositional generalization.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (10 more...)
WavePulse: Real-time Content Analytics of Radio Livestreams
Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay
Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to track answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at \url{https://wave-pulse.io}.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > New York > Kings County > New York City (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (215 more...)
- Media > Radio (1.00)
- Leisure & Entertainment (1.00)
- Government > Voting & Elections (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Clifford-Steerable Convolutional Neural Networks
Zhdanov, Maksim, Ruhe, David, Weiler, Maurice, Lucic, Ana, Brandstetter, Johannes, Forré, Patrick
We present Clifford-Steerable Convolutional Neural Networks (CS-CNNs), a novel class of $\mathrm{E}(p, q)$-equivariant CNNs. CS-CNNs process multivector fields on pseudo-Euclidean spaces $\mathbb{R}^{p,q}$. They cover, for instance, $\mathrm{E}(3)$-equivariance on $\mathbb{R}^3$ and Poincar\'e-equivariance on Minkowski spacetime $\mathbb{R}^{1,3}$. Our approach is based on an implicit parametrization of $\mathrm{O}(p,q)$-steerable kernels via Clifford group equivariant neural networks. We significantly and consistently outperform baseline methods on fluid dynamics as well as relativistic electrodynamics forecasting tasks.
- Europe > Austria > Vienna (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > United States > Texas > Howard County (0.04)
- Europe > Austria > Upper Austria > Linz (0.04)
Coupled Ocean-Atmosphere Dynamics in a Machine Learning Earth System Model
Wang, Chenggong, Pritchard, Michael S., Brenowitz, Noah, Cohen, Yair, Bonev, Boris, Kurth, Thorsten, Durran, Dale, Pathak, Jaideep
Seasonal climate forecasts are socioeconomically important for managing the impacts of extreme weather events and for planning in sectors like agriculture and energy. Climate predictability on seasonal timescales is tied to boundary effects of the ocean on the atmosphere and coupled interactions in the ocean-atmosphere system. We present the Ocean-linked-atmosphere (Ola) model, a high-resolution (0.25{\deg}) Artificial Intelligence/ Machine Learning (AI/ML) coupled earth-system model which separately models the ocean and atmosphere dynamics using an autoregressive Spherical Fourier Neural Operator architecture, with a view towards enabling fast, accurate, large ensemble forecasts on the seasonal timescale. We find that Ola exhibits learned characteristics of ocean-atmosphere coupled dynamics including tropical oceanic waves with appropriate phase speeds, and an internally generated El Ni\~no/Southern Oscillation (ENSO) having realistic amplitude, geographic structure, and vertical structure within the ocean mixed layer. We present initial evidence of skill in forecasting the ENSO which compares favorably to the SPEAR model of the Geophysical Fluid Dynamics Laboratory.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Pacific Ocean (0.04)
- North America > United States > Texas > Howard County (0.04)
- (2 more...)
ORCA: A Global Ocean Emulator for Multi-year to Decadal Predictions
Guo, Zijie, Lyu, Pumeng, Ling, Fenghua, Luo, Jing-Jia, Boers, Niklas, Ouyang, Wanli, Bai, Lei
Ocean dynamics plays a crucial role in driving global weather and climate patterns. Accurate and efficient modeling of ocean dynamics is essential for improved understanding of complex ocean circulation and processes, for predicting climate variations and their associated teleconnections, and for addressing the challenges of climate change. While great efforts have been made to improve numerical Ocean General Circulation Models (OGCMs), accurate forecasting of global oceanic variations for multi-year remains to be a long-standing challenge. Here, we introduce ORCA (Oceanic Reliable foreCAst), the first data-driven model predicting global ocean circulation from multi-year to decadal time scales. ORCA accurately simulates the three-dimensional circulations and dynamics of the global ocean with high physical consistency. Hindcasts of key oceanic variables demonstrate ORCA's remarkable prediction skills in predicting ocean variations compared with state-of-the-art numerical OGCMs and abilities in capturing occurrences of extreme events at the subsurface ocean and ENSO vertical patterns. These results demonstrate the potential of data-driven ocean models for providing cheap, efficient, and accurate global ocean modeling and prediction. Moreover, ORCA stably and faithfully emulates ocean dynamics at decadal timescales, demonstrating its potential even for climate projections. The model will be available at https://github.com/OpenEarthLab/ORCA.
- Pacific Ocean > South Pacific Ocean > Tasman Sea (0.04)
- Indian Ocean (0.04)
- Atlantic Ocean > Mediterranean Sea (0.04)
- (10 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Modeling & Simulation (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Generative modeling through internal high-dimensional chaotic activity
Fournier, Samantha J., Urbani, Pierfrancesco
Generative models aim to create samples statistically similar to those belonging to a training dataset: their goal is to fit the probability distribution from which the datapoints supposedly come from. In a generic setting, this probability distribution takes the form of a Boltzmann factor. The corresponding Energy Based Models (EBMs) fit the parameters of the Hamiltonian of the Boltzmann distribution and can be viewed as maximum entropy models, where the statistical properties of the dataset are imposed as constraints to low degree correlation functions [1, 2], (see [3, 4] for recent reviews). The resulting learning rule can be viewed as a gradient ascent on the Log-Likelihood (LL). However, running the training dynamics is a notoriously challenging task: at each epoch, the evaluation of the gradient of the LL requires the computation of the correlation functions of the degrees of freedom as predicted from the current estimation of the model's probability distribution. This is typically an intractable problem from an analytical point of view and is generally tackled numerically through parallel Monte Carlo Markov Chain (MCMC) simulations.
- North America > United States > Texas > Howard County (0.04)
- Europe > France (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)